Unequal group variances in microarray data analyses

Authors

  • Meaza Demissie
  • Barbara Mascialino
  • Stefano Calza
  • Yudi Pawitan
Abstract

MOTIVATION: In searching for differentially expressed (DE) genes in microarray data, we often observe that a fraction of the genes have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses the individual group variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance.

METHODS: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits information from across the genes.

RESULTS: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t-test, and better than the standard t and Welch tests, when the group variances are equal. These results mean that the MWT is more reliable than other existing tests over a wider range of data conditions.

AVAILABILITY: An R package to perform the MWT is available at http://www.meb.ki.se/~yudpaw
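To make the contrast between pooled and unpooled standard errors concrete, here is a minimal Python sketch of the two classical statistics the abstract compares: the standard (pooled-variance) t statistic and the Welch statistic with its Welch-Satterthwaite degrees of freedom. This is not the MWT itself; the abstract does not give the weighting scheme or the cross-gene variance estimator, so neither is attempted here. The function names `pooled_t` and `welch_t` are illustrative and do not come from the authors' R package.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Standard two-sample t statistic, assuming equal group variances.

    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def welch_t(x, y):
    """Welch statistic, using each group's variance separately.

    Returns (t, Welch-Satterthwaite approximate degrees of freedom).
    """
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2
    se = math.sqrt(a + b)  # unpooled standard error
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean(x) - mean(y)) / se, df


# Made-up expression values for one gene, three samples per group,
# with clearly unequal spread between the groups.
group_a = [1.0, 2.0, 3.0]
group_b = [0.0, 5.0, 10.0]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ. In this made-up example the Welch degrees of freedom drop from 4 to roughly 2, which is one way to see why the Welch test loses sensitivity in very small samples.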

Similar resources

Can Unequal Residual Variances Across Groups Mask Differences in Residual Means in the Common Factor Model?

Equality of residual variances across groups is one of the necessary conditions of measurement invariance. The main argument for not applying this restriction in the analysis of empirical data is that unequal residual variances across groups are differences in reliability of the observed variables rather than a violation of measurement invariance. A power study is carried out to investigate the...

Quality optimised analysis of general paired microarray experiments.

In microarray experiments, several steps may cause sub-optimal quality and the need for quality control is strong. Often the experiments are complex, with several conditions studied simultaneously. A linear model for paired microarray experiments is proposed as a generalisation of the paired two-sample method by Kristiansson et al. (2005). Quality variation is modelled by different variance sca...

A Parametric Bootstrap Approach for One-Way ANOVA Under Unequal Variances with Unbalanced Data

This research provides a solution for one-way ANOVA without transformation when variances are heteroscedastic and group sizes are unequal. The parametric bootstrap test (Krishnamoorthy, Lu, & Mathew, 2007) has been shown to be competitive with many other methods when testing the equality of group means. We extend the parametric bootstrap algorithm to a multiple comparison procedure. Simu...

Weighted analysis of paired microarray experiments.

In microarray experiments, quality often varies, for example between samples and between arrays. The need for quality control is therefore strong. A statistical model and a corresponding analysis method are suggested for experiments with pairing, including designs with individuals observed before and after treatment and many experiments with two-colour spotted arrays. The model is of mixed type w...

A comparative review of estimates of FDR in small microarray experiments

The focus of this paper is to illustrate and compare three recently suggested methods for the identification of differentially expressed genes and estimation of the false discovery rate (FDR), and to investigate whether these FDR estimation methods are biased when we have a small sample size of 3 subjects per group. The methods are estimation of FDR based on averaging of fdr1d (FDR.avg), FDR est...

Journal title:
  • Bioinformatics

Volume 24, Issue 9

Pages: -

Publication date: 2008